Consolidating Industrial Small Files Using Robust Graph Clustering
Authors
Abstract
Small file management is widely encountered in industrial areas. Consolidating small files can benefit the performance of a data system. Many existing consolidation solutions fail to realize the importance of a proper consolidation schema. Therefore, they use very primitive and ineffective schemas. In this paper, we focus on proposing an effective and robust consolidation schema. Unlike most existing solutions, which consider only the historical workload, we consider the workload uncertainty issue and propose a graph-clustering-based solution that is more robust in the future. To do this, we introduce robust optimization, a mathematical model that provides theoretical support for solving uncertainty issues. Then, we demonstrate that the robustness of a consolidation schema can be achieved using a graph clustering algorithm with a duplication mechanism. Since duplication leads to redundancy, we introduce a parameter to control the redundancy. We also propose two algorithms to estimate the parameter automatically. Experimental results on both synthetic and real-life data sets show the effectiveness of our algorithm.
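The abstract names graph clustering with a duplication mechanism and a redundancy-control parameter but does not give the algorithm itself. The following is only a minimal illustrative sketch of that general idea, not the authors' method. All function names, the thresholded connected-components clustering, and the `max_copies` budget are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def co_access_graph(requests):
    """Weight each file pair by how many requests touch both files."""
    weights = defaultdict(int)
    for req in requests:
        for a, b in combinations(sorted(set(req)), 2):
            weights[(a, b)] += 1
    return weights


def cluster(files, weights, min_weight):
    """Group files joined by edges of weight >= min_weight (union-find).

    Assumption: a simple stand-in for a real graph clustering algorithm.
    """
    parent = {f: f for f in files}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), w in weights.items():
        if w >= min_weight:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for f in files:
        groups[find(f)].add(f)
    return sorted(groups.values(), key=lambda g: sorted(g))


def duplicate(clusters, weights, max_copies):
    """Copy files into foreign clusters they are linked to, heaviest links
    first, making at most max_copies copies (hypothetical redundancy budget)."""
    home = {f: i for i, c in enumerate(clusters) for f in c}
    pull = defaultdict(int)  # (file, foreign cluster index) -> total link weight
    for (a, b), w in weights.items():
        if home[a] != home[b]:
            pull[(a, home[b])] += w
            pull[(b, home[a])] += w
    result = [set(c) for c in clusters]
    ranked = sorted(pull.items(), key=lambda kv: (-kv[1], kv[0]))
    for (f, target), _ in ranked[:max_copies]:
        result[target].add(f)
    return result


if __name__ == "__main__":
    # Hypothetical workload: each request lists the small files read together.
    requests = [["a", "b"], ["a", "b"], ["c", "d"], ["c", "d"], ["b", "c"]]
    w = co_access_graph(requests)
    base = cluster(["a", "b", "c", "d"], w, min_weight=2)
    merged = duplicate(base, w, max_copies=1)
    print([sorted(c) for c in merged])  # [['a', 'b'], ['b', 'c', 'd']]
```

In this toy workload, copying file `b` into the second consolidated group lets the occasional `b`, `c` request be served from one group, at the cost of one redundant copy. Raising or lowering `max_copies` trades redundancy against coverage of cross-group requests.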
Similar resources
Robust Synchronization-Based Graph Clustering
Complex graph data now arises in various fields like social networks, protein-protein interaction networks, ecosystems, etc. To reveal the underlying patterns in graphs, an important task is to partition them into several meaningful clusters. The question is: how can we find the natural partitions of a complex graph which truly reflect the intrinsic patterns? In this paper, we propose RSGC, a n...
Robust Spectral Clustering Using Statistical Sub-Graph Affinity Model
Spectral clustering methods have been shown to be effective for image segmentation. Unfortunately, the presence of image noise as well as textural characteristics can have a significant negative effect on the segmentation performance. To accommodate for image noise and textural characteristics, this study introduces the concept of sub-graph affinity, where each node in the primary graph is mode...
Robust Statistics and Fuzzy Industrial Clustering
The search for groups of important sectors in an economy has been and still is one of the more recurrent themes in input-output analysis. But a sector can probably be important for some questions at the same time, to a different degree. In this direction, a multidimensional fuzzy clustering analysis gives as a result a classification of sectors illustrating the different roles that each one pla...
Using clustering strategies for creating authority files
As more online databases are integrated into digital libraries, the issue of quality control of the data becomes increasingly important, especially as it relates to the effective retrieval of information. Authority work, the need to discover and reconcile variant forms of strings in bibliographic entries, will become more critical in the future. Spelling variants, misspellings, and transliterat...
Learning Robust Graph Regularisation for Subspace Clustering
Various subspace clustering methods have benefited from introducing a graph regularisation term in their objective functions. In this work, we identify two critical limitations of the graph regularisation term employed in existing subspace clustering models and provide solutions for both of them. First, the squared l2-norm used in the existing term is replaced by a l1-norm term to make the regu...
Journal
Journal title: IEEE Transactions on Network Science and Engineering
Year: 2023
ISSN: 2334-329X, 2327-4697
DOI: https://doi.org/10.1109/tnse.2022.3195350